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Abstract 



rjQ , This paper proposes a non-Gaussian Markov field with a special feature: an explicit partition function. 

To the best of our knowledge, this is an original contribution. Moreover, the explicit expression of the 
partition function enables the development of an unsupervised edge-preserving convex deconvolution 
method. The method is fully Bayesian, and produces an estimate in the sense of the posterior mean, 
numerically calculated by means of a Monte-Carlo Markov Chain technique. The approach is particularly 
effective and the computational practicability of the method is shown on a simple simulated example. 
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I. Introduction 

The research concerning regularization for ill-posed inverse problems was first carried out by Phillips, 
Twomey and Tikhonov in the sixties and are compiled in [1]. For the specific problem of deconvolution 
they lead to the contributions of Hunt [2] based on toroidal models and fast implementation by FFT. 
These methods rely on quadratic penalization i.e., Gaussian laws in a Bayesian framework. The solutions 
thus formulated are linear w.r.t. the data and numerically efficient. However, their resolution is limited: 
the capability to properly restore sharp edges is limited. 
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At the beginning of the eighties, in order to overcome these limitations, Geman & Geman [3] (see 
also [4]) introduced a much superior Markovian field including hidden variables [5]. The hidden variables 
(also referred to as dual or auxiliary variable) are binary and interactive variables modeling sharp edges 
and closed contours. The data processing then relies on a detection-estimation strategy and allows the 
recovery of distinct zones with abrupt changes. The calculation of the solution in the sense of the 
maximum a posteriori is based on a simulated annealing algorithm which requires intensive numerical 
computations. For the sake of computational efficiency in some cases, Geman & Reynolds [6] and then 
Geman & Yang [7] introduced auxiliary (also referred to as dual) variables: the sampling of a correlated 
non-Gaussian field reduces to the sampling of a correlated Gaussian field for one part and to the sampling 
of a separable field for the other. Furthermore, the construction proposed by [7] is founded on the work 
of Hunt and the toroidal models: the sampling of the correlated Gaussian field reduces to the sampling 
of an inhomogeneous white Gaussian field followed by an FFT. The proposal below takes advantage of 
this construction. 

The case of fields with convex potential [8-13] (see also [14, 15]) was laid down in the nineties as 
fulfilling a compromise between the quality of the reconstructed images and the computational burden. In 
this framework, a particular attention has been paid to the case of L2 — Li potentials [9-13]: a quadratic 
behavior around the origin and a linear behavior at large values allow edge preservation. In this context, 
the constructions of [6] and [7] respectively led to two algorithms: ARTHUR and LEGEND [16] (see 
also [17]). The work presented here concerns this type of potential. 

With such potentials, the regularized solutions usually necessitate the adjustment of three hyperparam- 
eters: two parameters to control the law for the image and one parameter to control the law for the noise. 
Several attempts are dedicated to the question of hyperparameter estimation and the investigated solutions 
are frequently based on statistical approaches: (approximated or pseudo) likelihood, Bayesian strategies, 
EM and SEM algorithms. .. The reader may consult papers such as [18-24] and reference books such 
as [25, Part.VI], [26, Ch.7] or [27, Ch.8]. These approaches are potentially very powerful but they come 
up against a major difficulty: the partition function of existing a priori fields depends on hyperparameters 
and is not explicitly given. 

The first novelty of the paper lies in the fact that it proposes a new random field with an explicit partition 
function. To this end, the paper build an original type of compound (toroidal) field with L2 — Li potential. 
The work is largely inspired by the Bayesian interpretation of dual variables in terms of Location Mixture 
of Gaussian proposed by [28]. Moreover, it is also inspired by [29] (itself based on the contributions 
of Hunt [2] and Geman & Yang [7]). However, none of these contributions put forward the idea of a 
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field with an explicit partition function. Afterwards, the paper proposes a second novelty: a full Bayesian 
unsupervised (i.e., including hyperparameter estimation) edge-preserving convex deconvolution method, 
thanks to the knowledge of the partition function. It is based on a posterior law for the whole set of 
unknown parameters (including hyperparameters) and a Minimum Mean Square Error strategy. 

The paper is presented in the following manner. Section |]Tj introduces the notations and states the 
problem. Section |TFFJ is devoted to the construction of the proposed field, Section [TV] proposes its use 
for image deconvolution and demonstrates the numerical practicability. Conclusions and perspectives are 
delivered in Section [Vj Most of the calculations are explained in Appendices J] to I VIII I 

II. Notation and problem statement 

Work is carried out onPxP real images, with N = P 2 pixels, represented in a matrix form. a pq 
denotes the generic element of the matrix A, A^A) = ^ | a pq | its squared norm and A its FFT-2D. 

o 

The transformation is normalized: the Parseval relationship is written as A^(A) = A^fA) and the sum 
of the pixels is Yl a vi = P^aa- The symbols * and (g> respectively represent the circular convolution 
and the Schur product (termwise) of matrices. If F represents a circular filter and J an input object, the 

o o o 

output is written O = F ★ I in the spatial domain resulting in O = F (g) I in the Fourier domain. If 
fpq f° r ai l Pi Q> tne associated filter is invertible. 

In the subsequent developments about deconvolution, Y, X, H, and N respectively denote the 
observed data, the unknown object, the convolution matrix and the observation noise. With these notations, 
the observation equation is written: 

Y = H*X + N . (1) 

The deconvolution problem consists in recovering the unknown object X given the observed data Y and 
given the observation model H. The ill-posedness of the problem has been well identified for several 
decades and the problem is nowadays often tackled in a Bayesian framework using Markov priors. In a 
Gibbs form, the prior law writes: 

f x [X] = Kx 1 exp[-$ e (X)] , 

where Kx is the partition function (normalizing constant) and is the Gibbs energy controlled by 
a set of parameters (such as variance, threshold, scale, correlation length. . . ) collected in a vector 0. 
The general methodology is well-known: the solution is determined from the a posteriori law and a 
point estimate can be chosen as the mean or the maximizer, for instance. Anyway, the posterior law 
(and the point estimates) depends upon hyperparameters notably on the parameters of the prior 0. The 
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inference about these parameters can be attempted in a statistical framework whose keystone is an exact 
and explicit likelihood function (in an usual sense or in a posterior sense). This function is itself founded 
on a complete expression for the prior law including the partition function as it depends on 6. It is given 
as a large dimension integral: 



It is a commonplace to say that Kx{0) can be explicitly given for two well-known classes of 
(continuous state) field: 
(i) $ is quadratic, i.e., the field is Gaussian 
(n) $ is separable, i.e., the field is white. 

In other cases and especially for non-separable and non-Gaussian fields, the theoretical calculation and 
the numerical computation of Q are desperate tasks [25, p. 281] and they have never been achieved^. 
However, its achievement is made possible and simple in the next Section, for a special non-separable 
and non-Gaussian field. 



Taking advantage of (i) and (n) above, the proposed random field is a compound field involving two 
variables: a pixel variable noted as X and an auxiliary (or dual or hidden) variable noted as B. The 
joint law for (X, B) is defined by the law of X\B for one part and by the law of B for the other part. 
The former is a Gaussian component (case (i) above) and the latter is a separable component (case (ii) 
above). 

A. Toroidal Gaussian Field for X\B 



Let us consider two matrices B and F with /„_ ^ for all p, q and the toroidal (circular shift invariant) 
Gaussian field with a density parametrized in the form: 



where 7d > is an inverse variance. The matrix F designs the field structure and especially the 
neighborhood system and the form of the cliques. In the Fourier domain, the potential is separable 

'The partition function is however known for the Ising field [30]. It is a binary field out of the scope of the developed work. 




(2) 



III. Prior Field with Partition Function 



f x \ B [X\B}= K~ l exp [- 7d N 2 (F * X - B)/2] 



(3) 
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and naturally develops in two forms: 

N 2 (F*X-B) = N 2 (f®X-B^J 



Eo o 
I fpq x pq ~ bp 



|2 
u pq I 

pq 



E° 2 ° O O 

I fpq I I x pq ~ bpql fp 



|2 

J Pql J pq I 

pq 



which has three essential consequences for the following developments. 

o o o o 

1) The law for X is separable and each X pq is Gaussian with mean b pq /f pq and inverse variance 
7d|/ pg | 2 . As a result, the sampling of X reduces to the sampling of an inhomogeneous white 
Gaussian noise followed by an FFT-2D. 

2) The change of variable X = F ★ X is invertible, X is white and each X pq is Gaussian with mean 
b pq and common inverse variance 7<j. 

3) The partition function K x \b is easily tractable in the Fourier domain thanks to a change of variable 



K X \ B = f exp[- ld N 2 (F*X-B)/2] dX 

J~R. N 



pq\ 

and does not depend on B. 
In relation to existing works such as [7, 16,27-29], the main idea here is simply to focus on the case 
where the change of variable X = F*X is invertible (point 2 above) that is to say the number of cliques 
and the number of pixels are equal. 

Remark 1 — The partition function K x \b does not depend on B as a counterpart of a limitation: the 
number of cliques and the number of pixels are equal. As an illustration of the limitation, let us point 
out that K x \b depends on B for afield based on horizontal cliques plus vertical cliques (the number of 
cliques is greater than the number of pixels). 



B. Compound Field 

A separable and homogeneous field is then introduced for the auxiliary variable B with a density 
f B [B], product of the f B [b pq ]. The joint density is written as f XjB [X, B] = f x \ B [X\B] f B [B] and 
the marginal law is obtained by integrating the auxiliary variables: 

f x [X] = f f xl B[X\B]fB[B] dB. 
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Fig. 1. A sample of the field, with 7d = 7b = 1 (e is also set to 1). 



4000 
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Fig. 2. Histograms (image at Fig. [T). From top to bottom: histogram of image pixels X, histogram of auxiliary variables B 
and histogram of differences X. 



Since the partition function K x \s does not depend on B, the calculations can be achieved 



fx{X) 



K x\b I fB[B] exp[- 7d iV 2 (F*X-£)/2] dB 



K x\B II / fzibpg] ex P ~7d (x P q ~ bpq) 2 /2 



db 



pq 
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which involves a separable convolution product. 

Remark 2 — The proposed construction is possible for any probability density function fg. In this 
sense, it is possible to design a large class of potential functions. 

Thus, a wide range of law is available, but the convex potential case is the one of interest here, as 
mentioned in the introduction. So, the following property is of importance. 

Property 1 — For any log-concave probability density function fs, the probability density function fx 
is log-concave [31, Theo. 7], [32]. 




1 2 3 4 5 




Fig. 3. From top to bottom: potential function, first and second derivative. Solid line: Log-Erf potential tp(x) of Eq. ^ and 
dotted line: corresponding Huber potential of Eq. (T7J. The potential parameters are 7d = 7b = 1 and hence the equivalent Huber 
parameters are A ~ 0.32 and s ~ 1.56, according to Eq. JHJ. 



C. Laplace Law for Auxiliary Variables 

The following developments are dedicated to the case of auxiliary variables under a Laplace law 
suggested by [28]. 
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Fig. 4. A and a as a function of 7b, for a fixed 7d (7<j = 1) on a log-log scale. As expected, the plot essentially shows two 
linear behaviors and a critical case for 7b = \/2tv (log 10 v2tt ~ 0.4). 



As mentioned by [28] itself, among the Huber-like distributions, such a Laplace-convolved-Gauss prob- 
ability will have two main advantages: (i) the convolution involved in the marginal law fx (Section TlII-Bt 
will be made explicit and (ii) the sampling of auxiliary variables (Section [IV-CI ) will be directly feasible 
thanks to the inversion of the cumulative density function Fb\x- 

The Laplace law is written in the form: 

f B [B] = Kg 1 exp [- 7b Ni (B) /2] , (4) 

where 7b > is a scale parameter, and N\(B) = ^ \b pq \ is the Li norm. The partition function is 
simply calculated thanks to separability 

K B = f exp [- 7b iVi(B) /2] dB = [-y h /4]- N . 

According to © and (@]) the joint density for (X,B) takes the form: 

fx,B [X,B] = 

(5) 

K~] B exp [- 7d N 2 (F * X - B) + 7b N x (B) /2] 

and the partition function is explicit: Kx,b = K x \b Kb- 

The marginal law for X involves the one-dimensional convolution of a Gaussian density and a Laplacian 
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density. 



fx[X] 



pq 



exp 



R 



7d {x. 



pq w pqJ 



+ 7b \bpg\ /2 



db 



/"/ 



n / (+°°' 7d, 7b) 



where / is denned in Appendix HT1 Thus, the potential function ip appears: 



fx [X] = K X X B exp 



pq 



with: 



tp(x) = -21og/(+cx),a;,7 d ,7 b ) . 



(6) 



It is named the Log-Erf potential and it is shown in Fig. [3j The details of the calculations concerning 
this potential are given in Appendix |III] Concerning the first derivative, one has: 



tp'(0) = and <p'(+oo) 
and concerning the second derivative at origin, one has: 



7b • 



7b 



(r) y/n erfcx [77]) 1 — 1 



with 7] = 7b I \J 87a (erfcx [•] is given in Appendix H]). As expected (see Property [[J, this is a convex 
potential. It is a L2 — Li potential which can be reconciled with other more common L2 — Li potentials 
(Huber, log-cosh, hyperbolic, fair function). In the case of the Huber potential: 

„2 

11 I I ^ o 

(7) 



A < 



x 



2s I x I 



if I x I < s 
s 2 if I x I > s 



by identifying the second derivatives at zero and the slopes at infinity, one has: 



A 



1 



if" (0) and 



(8) 



2 r " if"{0) 

Compared Log-Erf and Huber potentials and their derivatives are shown in Fig. [3] Using the expan- 
sions (fT4l ) and (fT3T ) of Appendix Jl two limit cases can be identified, according to the value of the ratio 
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77: 

for 77 » 1 : A ~ 7d ; s ~ —— 

2 7 d 



tor 77 <C 1 : A ~ 7 b1 




2vr ' V 27 d 

In the two limit cases, on a log-log scale, there is linear behavior of A and s as a function of 7b, for a 
fixed 7d (see Fig. 0]). The intersection of the two linear behaviors can be identified as a critical behavior 
for 7b = ^/27T7d. The critical value will be used for the initialization of simulations of Section ITV-DI (see 
also Appendix IVIIII) . 

D. Practical Case 

In practice, the field is based on a 3 x 3 Laplacian filter, defined by [0,1,0; 1 , —4 , 1 ; 0,1,0] and 

o 

represented by the matrix D. At null frequency one has d 00 = and as a consequence the mean level of 
the image is not managed. So, an extra parameter is introduced to drive the mean level: it is denoted by 
£ (e > 0) and the characteristic matrix F is set to F e = D + e. 

Remark 3 — If e = the field cannot be normalized and each clique is formed from the four nearest 
neighbors ( cross-like clique ). If e 7^ 0, the field can be normalized and each clique is spread out over 
the entire image. 

The following developments take e > and the partition function of the joint field writes: 
K- x ]b = S e 7 d AV2 7^ , with 5 = (32vr)^/ 2 JJ \d p 



^(0,0) 



Fig. Q] gives a sample of the field with 7d = 7b = 1 and Fig. [2] gives histograms of the image pixels, 
the auxiliary variables B (a Laplace histogram) and the differences X (an over-Gaussian histogram). 

Remark 4 — It is noteworthy that the marginal model X is homogeneous, but the conditional model 
X\B is non-homogeneous (except if all the b pq are equal). 

IV. Deconvolution 

As a result of the previous Section, a new random field is now available with a special feature: an 
explicit (and simple) partition function. In the present Section, the field serves as a prior in a deconvolution 
method whose specificity is to be unsupervised {i.e., including hyperparameter estimation). More precisely, 
the method relies on a full Bayesian framework and the solution is determined from an a posteriori law 
based on an a priori law (given below) for the object, the noise and the hyperparameters. 
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A. Prior choices 

1) Object law: The a priori field is defined in the previous section. The joint density for (X,B) is 
given by §5$ and it is driven by three parameters: 7<j, 7b and e. 

2 ) Noise law: The present work is founded on the usual case of zero-mean white Gaussian noise with 
inverse variance denoted 7 n . The density is written: 



3) Hyperparameter law: Four parameters are to be managed: j n , 7d, 7b and e. The three parameters 
of major importance are 7 = [7 n , 7d, 7b]; the fourth parameter e drives the prior mean level of the image 
and it is considered as a nuisance parameter. Anyway, very few is a priori known about these parameters 
and the idea is to use non-informative or diffuse and separable priors. 

• The proposed prior law for the three parameters 7 n , 7^ and 7b is a conjugate law. It is a gamma law 
(see Eq. (fl5l ). Appendix JV]) with parameters respectively denoted (a n ,/? n ), (ay, (3d) and (ab, AO- 
It allows for easy computations with the posterior law. Moreover, it includes diffuse and non- 
informative prior: the uniform prior and the Jeffrey's prior are obtained as limit cases for (a,/3) = 
(l,oo) and for (a, (3) = (0,oo) respectively. 

• The last parameter e is considered as a nuisance parameter and the proposed strategy resorts to 
integration out. The desired prior law is a Dirac law, so that no information is accounted for about 
the mean level of the image (it is set on the basis of observed data only). Formally, in a first step 
a uniform density over [ 0, M £ ] is introduced and in a second step the limit law for M £ — > is 
considered. 

B. Joint Law 

Thus, the joint law is established for (y,X,B,C,£): 



MN) = (2tt) 



N/2 



7^ /2 exp[- 7 n N 2 (N)/2] . 



fy,X,B,C,£{ Y i X ' B il,£) 



x , a n -l+A72 Qa-l+AT/2 Qb -1+AT „ Twr-l-n / \ 

« 7n 7 d 7 b e M e l[0,JWe]( £ ) 



exp - {Q £ /2 + 7n //3 n + 7d //3 d + 7b//?b} 
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where 5' = (2ir) N/2 5/(3^ T[a n ] T[a h ] /3j d T[a d ] is a normalization constant and Q £ is part of the 
Co-logarithm of the density involving F £ : 

Q £ = 7n N 2 (Y - H * X) + 7d N 2 (F £ * X - B) + 7b N x (B) . 

The a posteriori density is formed for X, B, C and £ , given y thanks to the Bayes rule: 

fx,B,c,£\y(X,B,-f,s\Y) = 

fx,B,c,s,y{X,B^,e,Y) 



[ fx.B.c,e.y{X, B, 7 , e, Y) dX dB d 7 de ' 

JX,B,~f,e 

and it is parametrized by the (a,0) and M £ . Then, e is integrated out and the law for X, B, C given 3^ 
writes 

f XtBtC \y(X,B,j\Y) = Jj x ^c, £ \y(X,B n ,e\Y)de. 

It is also parametrized by the (a, j3) and M £ , so, the limit is set when M £ tends to 0. The detail of the 
calculations is given in Appendix [V] and it is shown that a probability density function is obtained if the 

o 

mean level of the object is observed, i.e., h 00 ^ 0. 

C. Posterior Law and Posterior Mean 

Thus, the Total Posterior Law can be deduced for all the unknown parameters (X,B,C) given the 
observed data y: 

fx,B,c\y(X,B,j\Y) oc 

a a -l+N/2 ad -l+N/2 a b -l+N 
7n 7 d 7b (9) 

exp - {Qq/2 + 7n //3 n + 7d //3 d + 7b //5 b } 



where Qq involves Fq = D: 

Qo = 7n N 2 (Y - H * X) + 7d N 2 (D * X - B) + 7b N x (B) . 

In practice, the chosen point estimate is the posterior mean (i.e., the Minimum Mean Square Error). 
Its calculation is performed by means of Monte-Carlo Markov Chain stochastic sampling algorithm [25, 
33]: auxiliary variables, object and hyperparameters are successively sampled given the other in a Gibbs 
strategy. 
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Data 


PM 


CPM 


CPM 


CPM 


CPM 


MAP-LogErf 


MAP-Huber 










(best 7b) 


(best 7 d ) 


(best 7 n ) 






Dist. L2 


11.62% 


3.93% 


3.94% 


3.87% 


3.85% 


3.81% 


5.56% 


5.63% 


Dist. LI 


35.47% 


19.47% 


19.50% 


18.92% 


19.40% 


19.18% 


20.68% 


20.84% 



TABLE I 

Quantitative comparison by means of L2 and LI distances between true image and data (column 1), true 

IMAGE AND ESTIMATED IMAGES (COLUMN 2 TO 8). 



1) Sampling auxiliary variables: The sampling of auxiliary variables is delicate but can be directly 
done. It is based on the inversion of the cumulative density function (cdf) F B \ X . It is sufficient to uniformly 
sample u in [0, 1] and to compute b = F~^ x {u). The calculations can be found in Appendix [VT1 

o 

2) Sampling object: The object is a toroidal Gaussian field and the X pq are independent with mean 
fi pq and inverse variance v pq (see calculations in Appendix I VII I ) 

Vpq = 1n\h pq \ 2 +Jd\°dpq\ 2 (10) 



pq 



7n h pq y pq + 7 d d pq b pq / v pq (11) 



where superscript * stands for the complex conjugate. Thus, the sampling is reduced to the sampling of 
an inhomogeneous white Gaussian noise followed by an FFT-2D. 

3) Sampling hyperparameters: Each parameter 7 n , 7d and 7b follows a gamma^ law derived form (O 
(see Appendix ITVl ) with respective parameters a and (3 

a = a n + N/2 and /T 1 = fa 1 + N 2 (Y - H * X)/2 
a = a d + N/2 and = + N 2 (D * X - B)/2 

a = a h + N and = (3^ + N\(B)/2 . 

The description of the method and the algorithm are now complete and synthesized in Table [TT] The 
remainder of this Section illustrates the implementation practicability. 

2 The sampling of the Gamma variables is achieved using the Matlab function gamrnd. 
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20 40 60 80 100 120 20 40 60 80 100 120 20 40 60 80 100 120 20 40 60 80 100 120 

Fig. 5. From left to right : original image X*, observed data Y, deconvolved image Xpm and deconvolved image Xmap- 
At the top: gray level images and at the bottom: profile of the 100-th row (which encroaches on both the rectangle and the 
rhombus). In order to evaluate the relative dynamics in each case, all the images are shown in the same gray-scale between -0.5 
and 2. The four shown profiles are also presented between -0.5 and 2. 




D. Computation feasibility 

This part illustrates the previous developments and it only aims at demonstrating the numerical 
practicability of the method. It is built on a deliberately simple image X* appropriate in order to evaluate 
the capabilities and the limitations of the proposed approach: the image is set up from homogeneous 
zones separated by sharp edges (see Fig. [5J on the left). It is a 128 x 128 image composed of a black 
background and three objects with gray levels gradually changing between 0.7 and 2.1. The difference 
between neighboring pixels varies between and 2. 1 in absolute value. Regarding the Laplacian of the 
image, X = F*X, the set of can be split in two sets: 94% of the X^ are less than 2.10 -4 (inside 
homogeneous zones) and 6% of the X^ are greater than 3.10~ 2 (located around edges). No value is 
between 2.1(T 4 and 3.1(T 2 . 

The impulse response of the system is Gaussian shaped with 6 pixels width at half-maximum, the 
noise variance is 2.10~ 2 and the resulting observed image Y is shown in Fig. [5J(in the second column). 
The resolution is clearly degraded and details of the edges are no longer visible (neither on the gray 
level image nor on the shown profile). The dynamic is also strongly affected, notably at about the 64-th 
sample of the shown profile. 

The procedure is initialized by the empirical least-squares hyperparameters given in Appendix IVIIII 
The object X is initialized by the observed data (and there is no need to initialize the auxiliary variables). 
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Initialize 

• N = 1 , Delta = inf 
. HatX = Data 

• GamN, GamD, GamB (Annex [VTIlt 

Repeat 

• Sample step 

- Auxiliary variables B (Sect. HV-CTt 

- Object X (Sect. |IV-C.2t 

- Hyperparameter GamN, GamD, GamB (Sect. lIV-C.3"l > 

• Update 

- N = N+l 

- Delta = ( HatX - X ) / N 

- HatX = ( (N-l)*HatX + 1*X ) / N 
Until Delta<eps 

TABLE II 

Detailed algorithm (pseudo-code). 



Moreover, practically, the (a*, /?*) are set to (0, oo) corresponding to the Jeffrey's prior. 

The proposed algorithms generates samples of the a posteriori law fx,B,C\y(X , B, -y\Y). Practically, 
the algorithm behaves as expected: the stationary law is attained after a burn-in time (about 200 iterations) 
and remains in a steady behavior. The empirical mean of the generated images is recursively computed 
and the algorithm is stopped when its variation becomes smaller than a given value T (in quadratic norm). 
In the presented example T = 5.10 -4 , the algorithm produced 953 iterations and computation time was 
47 seconds. 

The resulting generated hyperparameters 7b, 7d and 7 n are shown in Fig. [7] The left part of the figure 
shows the 953 iterates of the three parameters: after about 200 iterations the three parameters are stabilized 
and seem to be under the stationary law of the chain. The empirical mean value (approximating the 
Posterior Mean) of the parameters respectively are 7b = 2.88 10 2 , 7d = 5.91 10 4 and 7^ = 1.99 10 3 . The 
iterates are also shown on the right hand side of Fig. |7]as histograms: they are clearly very concentrated 

3 The proposed algorithm has been implemented with the computing environment Matlab on a PC, with a 2 GHz AMD-Athlon 
CPU, and 512 MB of RAM. Code is ~ 100 lines long. 
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Fig. 6. Distances between the true image X* and Conditional Posterior Mean Xcpm as a function of the parameters 7b, 7d 
and 7 n , around the posterior mean value 7. From left to right: error is shown as a function of 7b, 7d and 7 n , Top row gives 
L2 distance and bottom row gives LI distances. The black dots give the minimum distances reported in Table U 

around the Posterior Mean (with small variance), i.e., the marginal law for the hyperparameters are 
quasi-Dirac distributions. 

Considering the numerical value, in the sense of Eq. ([8]), the equivalent regularization parameter is 
A = 2.17 10 1 and the equivalent threshold is s = 6.67 10~ 3 . It is noticeable that the threshold value 
correctly split the Xk in two sets (less than 2.10 -4 - greater than 3.1CT 2 ). The point is that the method 
automatically adjusts hyperparameters to correctly separate the X^. This is a first argument in favor of 
the proposed strategy in order to tune the threshold of an L2 — Li Gibbs potential. 

The resulting image is shown in Fig. [5] (on the third column). The effect of deconvolution is notable 
on the image in gray level as well as on the shown profile. The three objects are correctly positioned, 
the orders of magnitude are respected and the zero level is correctly reconstructed: it can be seen on the 
entire image and in particular on the shown profile. The dynamic is also correctly restored: this aspect is 
notable on the shown profile around the maximum (64-th sample). The true dynamic occupies the range 
0-1.9 whereas the dynamic of the observed data scarcely exceeds - 1.4: the proposed method restores 
the dynamic to - 1.88 that is to say 99% of the original variation. 

A global quantitative comparison has been achieved by evaluating (i) the distance between original 
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Fig. 7. Monte-Carlo Markov Chain for the three hyperparameters generated by the proposed Gibbs sampler. From top to 
bottom: 7b, 7d and 7 n . The left part of the figure shows the samples as a function of iteration index and the right part of the 
figure shows the samples as histograms. 



image X* and observed data Y and (ii) the distance between original image X* and estimated image 
X PM . The considered distances are normalized L2 and LI distances. The main results are listed in TableJl 
first and second columns and show an improvement by a factor 2.95 (11.62% to 3.93%) for L2 distance 
and a factor 1.82 (35.47% to 19.47%) for LI distance. 

In order to deepen the numerical study, a second estimate has been computed: Xqpm the Conditional 
Posterior Mean (CPM), i.e., the mean of the conditional posterior law fxmyfi{X,B\Y,~f). -Xcpm is 
clearly a function of the hyperparameters 7 and a twofold evaluation is proposed. 

• The first estimate is the one obtained with 7 = 7. Practically, the marginal estimate X PM and 
the conditional estimate _Xcpm(7) are quasi-equal; this is due to the fact that the marginal law 
for the hyperparameters are quasi-Dirac distributions. Quantitatively, regarding L2 distances, the 
PM produces 3.93% whereas the CPM produces 3.94%; regarding LI distances, the PM produces 
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19.47% whereas the CPM produces 19.50%. In both cases, the modification is almost imperceptible. 
• The measurement of errors has also been explored for the CPM as a function 7b, 7d and j n , around 
the posterior mean 7. Results are given on Fig. [6j in each case, smooth variation of distances is 
observed when varying parameters and an optimum is visible. It is reported on Table U and shows 
almost imperceptible modification: optimization of the hyperparameters (based on the true image 
X*) allows negligible improvement (smaller than 0.1% for L2 error and smaller than 0.5% for 
LI error). So, the main conclusion is that, the unsupervised proposed approach is a relevant tool in 
order to tune parameters: it works (without the knowledge of the true image) as well as an optimized 
approach (based on the knowledge of the true image). 
Finally, a third estimate has been computed: the Maximum A posteriori (MAP). It has been computed 
for the LogErf and the Huber potentials. Both of them have been computed with equivalent hyperparame- 
ters (given above): (7b, 7d, 7u) for the LogErf potential and (A, s) for the Huber potential. The two MAP 
solutions (LogErf and Huber) are visually indiscernible: this is expected from so similar potential. The 
results are presented in Fig. |5J right column: the estimated map suffers from cross-like artifact, due to 
the cross-like structure of the neighborhood system. Quantitatively speaking, the measurements of errors 
are given on Table QJ LogErf and Huber produce almost similar errors. Moreover, the errors are greater 
than the one produced by the PM and the CPM. 

The restoration is nevertheless imperfect and of limited resolution: the sharp edges remain slightly 
smoothed and limited in amplitude. The ringing effect also affects the quality of the deconvolved image. 
This diagnostic is long awaited in the framework of convex deconvolution. Anyway, the important point is 
not so much the property of the deconvolved image itself (intrinsic of any convex deconvolution) but the 
(new) practical capability to automatically tune the hyperparameters. Moreover, the potential improvement 
is certainly wide considering more heavy-tailed law for the auxiliary variables, as explained in the next 
section. 

V. Conclusion 

This paper presents a twofold novelty in the field of statistical image reconstruction and restoration. 

1) The partition function is explicitly given for a specific non-Gaussian Markov field, with an L2 — Li 
Gibbs potential. It is built as a compound field involving: an auxiliary variable following a separable 
Laplace distribution and a pixel variable following a Gaussian distribution given the auxiliary 
variable. 
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2) An unsupervised deconvolution method is deduced, based on the exact likelihood taking advantage 
of the knowledge of the partition function. The method is fully Bayesian, and the point estimate 
is the posterior mean computed thanks to a Monte-Carlo Markov Chain technique. 
The paper focuses on the deconvolution problem, but it is also possible to deal with simpler questions 
than deconvolution: parameter estimation from direct observation of the field, edge enhancement or 
denoising. 

Moreover, the paper relies on Gaussian noise, but the case of non-Gaussian noise is also envisaged, 
in particular the use of robust norms to reject abnormal data (outliers). To this end, a separable version 
of the L2 — Li proposed field could be suitable as a law for noise measurement. 

The proposed method can be directly applied in the case of large support operator, e.g., reconstruction 
problems such as Fourier synthesis [34]. The proposed methodology also remains valid for other linear 
model and the required modification concerns the sampling of the object. It remains Gaussian but its 
sampling is no longer possible in a single step for the entire image by FFT-2D. The Gibbs sampling 
techniques constitute an adapted tool but the calculation time would be (maybe dramatically) extended. 
For non-linear problems, the law for the object is no longer Gaussian and a case by case study is required. 

Concerning the a priori field, other laws for auxiliary variables are certainly desirable. The possible 
improvements are numerous considering more heavy-tailed law in order to overcome the limitation of the 
convex deconvolution. The methodology still remains valid but the difficulty then concerns the sampling 
of the auxiliary variables. The direct sampling by inversion of the cumulative density function may not 
be possible, however, the rejection or the Hastings-Metropolis algorithms could be used to overcome this 
difficulty. 

In the case of myopic deconvolution, it is also conceivable to estimate (part of) the parameters of the 
observation system. Here again, a case by case study is necessary, but the delicate question of the system 
parameter sampling can probably be tackled by means of rejection or Hastings-Metropolis algorithms. 

Appendix I 

ERF, ERFC, ERFCX 
The erf function is defined for x G R by: 




(12) 
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and ierf denotes the reciprocal function. Elsewhere, erfc [x] = 1 — erf [x] and erfcx [x] 
Concerning the latter, there are the following expansions: 

erfcx [x] ^~ [l — x -2 ] / (x\/tt) 

erfcx [x] ~ 1 — 2x/ t/it . 

and the derivative erfcx [x] = 2 x erfcx [x] — 2/^/tt. 



exp [x 2 ] erfc [x] . 

(13) 
(14) 



Appendix II 
Gauss and Laplace convolution 

Considering the calculations, a large part of the proposed developments is based on the convolution 
of a Gaussian function and a Laplacian function. 

A. Preliminary Calculi 

For xq > and x E R, write: 

J(x ,x,d,b) = / exp [- {d(y - x) 2 + by) /2] dy , 
Jo 

simply written as J(xq,x) when there is no ambiguity. On rewriting the argument of the exponential, 
we have: 

d(y -x) 2 + by = d [(y - x) 2 + (x 2 - x) 2 ] 
with x = x — b/2d. The change of variable t = (y — x) \J d/2, yields: 
J(xo,x) = \J Ti/2d exp [6 2 /8cZ] exp [— bx/2] 



| erf x a/ d/2 



erf 



[X - Xq) 



d/2]} 



where the function erf is defined by (fT2l . In particular, one has: J(0, x) = and 



J(+oo,x) = v / TT/2d exp [b 2 /8d] 

exp [— bx/2] erfc — x \f d/2 
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B. Convolution 



For xq, x 6 R, write: 



I(x ,x,d,b) = 



exp[-{%-x) 2 + %|}/2] dy 



— oo 



written simply as I(xo,x) when there is no ambiguity. By the change of variables y' = —y, y' = by, 
and y' = yVd, it is shown that: 

I(+oo, x, d,b) = I(+oo,—x,d,b) 
I(xo,x,d,b) = I(bxo, bx, d/b 2 , 1) / b 



It can thus be deduced that: 

I(xo,x,d,b) = J(+oo, — x, d, b) — J{— xq, — x, d, b) 
I(xo,x,d,b) = J(+oo,—x,d,b) + J(xo,x,d,b) 

respectively for xo < and xo > 0. These relationships are useful for the study of the potential function 
(next Appendix) and for the inversion of the cdf of B\X (Appendix IVD). 

Appendix III 
Log-Erf Potential function 

According to the results of the previous Appendix the potential function of the marginal field X, 
Eq. ©, Section UlI-CI is written: 



I(x ,x,d, b) 



I(xoVd,xVd, l,b/y/d) / Vd 



(p(x) 



21og/(+oo,x,7 d ,7 b ) . 



By putting: x( x ) = exp[7bx/2] erfc (p + x) \/7d/2 , p = 7b/27d the potential function can be 



written: 



tp(x) 



2 log [x(x) +x(-x)] 



up to additive constants. The derivation shows that : 
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and it can easily be deduced that: 

Lp (x) = -2 = - 7b r 

X(x) + X(-x) X(x) + X(-x) 

and in particular 

ip'(0) = and <p'(+oo) = j h . 
Moreover, concerning the second derivative at origin 

✓ (0) = -2^ = -7 b X ' (0) 



x(o) x(o) 

7] ypH erfcx [rj]) 1 — 1 



with 7] = 7b/\/87d- 



Appendix IV 
Gamma Probability Density Function 

The gamma probability density function is parametrized by a > and j3 > in the form: 

fy(x;a,0) = -^-r x"- 1 exp[-x//3] l R+ (s), (15) 

where 1 R+ is the indicator function of R + . The expected value is a/3, the variance is a (3 2 and it is 
maximal for x = (3 (a — 1) in the case a > 1. 

Appendix V 

INTEGRATION OF HYPERPARAMETER 

A. Preliminary Result 

Given a function / : R — ► R + , C°° and assume that g(x) = xf(x) can be integrated. By integrating 
from to M the Taylor expansion of g[x) at origin, one shows that: 

Then, give a function ifi : MP x R — > R + such that eifi(u,e) can be integrated over K.G+ 1 . By 
using ( fl6l) , it can be seen that: 

AT 

M > r ,} d7) 



J J 

Jv JO 



7] ip(v,r))dr]dv / V>O,0) dv 



provided that tp(v,0) can still be integrated over R Q . 
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B. Posterior Law 

The a posteriori law (Section II V-B b for X, B, C and £ given y (parametrized by the coefficient M £ ) 
is written, after simplification by S'M e : 

ip(u,e) de 

e=0 



[ [ ° if>(v,e) de d 7 dX dB 

JXBf Je=0 

ters X, B, 7 and: 

exp- {Q £ /2 + 7n //3 n + 7d //3 d + 7b //3 b } . 

To apply the relationship (fTTl) . it is sufficient to ensure that rp(v,0) can be integrated. Since the norms 
in H N are equivalent, k G R+ can be found such as Ni(B) < kN2(B) for all 5. Thus the integrand 

o 

can be majored by a Gaussian integrand and convergence ensured if and only if h 00 ^ 0. 
In the limit, when M e — ► 0, we have the result I©. 

Appendix VI 
Inversion of B\X cdf 

The sampling of auxiliary variables (Section IIV-CI) given the object is based on the inversion of cdf 
of B\X. For u e [0, 1]: 

u = F B \ x {b) = fs\x= 77- 7 T • (18) 

is to be resolved. In order to solve this equation, write p = 7t>/27d 



IXB-y Je=0 

where u represents all the parameters X,B,*y and: 



exp [+ 7b 6/2] erfc (p + 6) VTd/2 



6+ = exp [- 7b 6/2] erfc [(p - 6) VW2 

and 9 = 9- + 9 + . Moreover s = Fg|#(0) = 9-/9 and u = 9u — 9-. Equation ( fT8l ) is resolved differently 
depending on whether u < s (i.e., 6 < 0) or u > s (i.e., 6 > 0) and yields: 



6 = 6 + p - ierf {T_} a/2/ 7c1 if u<s 



6 = 6- p - ierf {T + } VVTd if « > » 
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where ierf {•} is defined in Appendix U and 



T_ = erf 
Ti = erf 



(6 + p) VW2 - u exp [- 7b 6/2] 



(6 - p) VW2J - « exp [+ 7b 6/2] . 
Thus it is possible to sample B\X simply from u uniformly distributed over [0,1]. 

Appendix VII 
Conditional posterior law for X 

The posterior law (X,B,C\y) given by Eq. (© in Section ITV-CI involves 

Qo = 7n N 2 (Y - H * X) + 7d N 2 (D * X - B) + 7b (B) , 
and the conditional posterior law 3-0 required to sample object in Section ITV-C.2I involves 

Q 0] = ln N 2 (Y - H * X) + ld N 2 (D * X - B) . 

In the Fourier domain: 

Qo\ = 7nN 2 (Y-H®X) + ld N 2 (D®X-B) 

Eo ° ° 2 °° °2 

7n 1 — h"pq x pq\ "T" 7d l^pg^pg — ^pgl 



that is to say a separable summation. Moreover, it can be rewritten and identified to a sum of quadratic 
terms: 



Qo\ — ^ ] v pq\ x pq l l pq\ 



_ ?. |2 

^pq I £pq 

pq 

with u pq and p pq given in Eq. (fTOl - (fTTT) . 



Appendix VIII 
Empirical Least Squares Hyperparameters 

The initialization of the algorithm is based on second order statistics of the analyzed data, in the Fourier 

o 

domain. Considering the structure of the a priori field and the noise, for all (p, q), such as / ^ one 
has: 

o o o o o / 

X pq = V^d Gpq/fpq and N pq = G pq 
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where G pq and G pq are two independent zero-mean white Gaussian noise with unitary variance. Moreover, 

l—i ° O O o o 

considering the observation equation (Q]), one also has Y pq = h pq X pq + N pq . With Z pq = \Y pq \ and 

o o 

r pq = | h pq \/\f pq \, we have: 

r ° ~\ 

E [Z pq \ = r d r pq + r n . 

Thus the parameters and r n can be selected at the minimum of the least squares criterion: 

J(Td,r n ) = z~^{z P Q ~ r d r pq + r n ) 2 . 

pq 

It is found that: 

7 — a5 (55 — 07 

rd = a 2 and rn = ~5 T 

p — a z p — a z 

with Na = r pq , N(5 = ^ r 2 pq , = ^ r pq z pq , and A^d = ]T z pg . These values for rd and r n are 
used to initialize the proposed algorithm (Section [IV-CI ): 7d = 1/Vd and 7 n = l/r n . The third parameter 
7b is initialized at the critical value: 7b = ^/2irjd- 
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